- ARITH.DOC: About arithmetic in the CSTAR translator
-
- June 25, 1991
-
- NOTE: stuff in brackets [ ] pertains to things that need to be done but
- aren't implemented yet.
-
- Arithmetic involving only integers is strictly typed without automatic
- conversions. Mixed-mode expressions must be written with casts; this
- is good practice anyhow, and since C types are slightly less restrictive
- than Pascal types, it is eminently feasible in all cases. Note also that
- longer results will *not* be silently truncated and blithely stuffed
- into shorter variables.
-
- If you are writing too many casts, it may pay to use longer forms for
- some of your variables. The 68000 moves words as easily as bytes,
- and the penalties for using long words are fairly modest except
- in multiplication or division.
-
- For a mixed-mode expression to need an intermediate result wider than
- its final result, it must contain a division or right shift of a
- quantity which is wider than the result, since those are the only
- operations in which bits can affect other bits to their right. (In
- such cases, CSTAR may generate no code for the cast if it is inherent
- in the operator; the multiplication of two ints to get a long has to
- be written:
-
- long_var = (long)int_1 * (long)int_2;
-
- but it would be silly to extend the integers and call the long
- multiply functions.)
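-
- The difference between the two widths can be sketched on a modern host.
- This is only an emulation under stated assumptions: int is taken to be
- 16 bits and long 32 bits, as on the 68000, and the helper names
- (wrap16, mul_int_width, mul_long_width, mul_demo) are hypothetical, not
- part of CSTAR.
-
```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 16 bits of v and read them as a signed 16-bit
   value, the way a 16-bit register would hold them. */
static int32_t wrap16(int32_t v)
{
    uint32_t low = (uint32_t)v & 0xFFFFu;
    return (low & 0x8000u) ? (int32_t)low - 65536 : (int32_t)low;
}

/* int_1 * int_2 kept at int width (16 bits assumed): high bits are lost. */
static int32_t mul_int_width(int16_t a, int16_t b)
{
    return wrap16((int32_t)a * (int32_t)b);
}

/* (long)int_1 * (long)int_2: the full 32-bit product survives. */
static int32_t mul_long_width(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Small usage example. */
void mul_demo(void)
{
    assert(mul_long_width(300, 300) == 90000);
    assert(mul_int_width(300, 300) == 24464);   /* 90000 - 65536 */
}
```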
-
- Characters are treated as 8-bit integers and are never automatically
- extended except for use as function arguments. A mixed char/int
- expression should be written exactly like its int/long analog in
- terms of form and of placement of casts. In the absence of casts,
- all char results are 8 bits wide. Thus:
-
- (int)(char_1 * char_2)
-
- means multiply char_1 by char_2 and extend the 8-BIT result to 16 bits.
- This treatment minimizes the generation of silly code in trivial
- comparisons and range checks on characters, which, in most programs,
- is the only use of character arithmetic.
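-
- The effect of cast placement on a char product can be emulated as
- follows. This is a sketch, assuming 8-bit signed chars and a 16-bit int;
- the function names are made up for illustration, and the sign extension
- is written out by hand so that the result does not depend on the host
- compiler.
-
```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 8 bits of v, as extending a byte register would. */
static int16_t extend8(int32_t v)
{
    uint32_t low = (uint32_t)v & 0xFFu;
    return (int16_t)((low & 0x80u) ? (int32_t)low - 256 : (int32_t)low);
}

/* (int)(char_1 * char_2): 8-bit product first, then extension. */
static int16_t product_then_extend(int8_t c1, int8_t c2)
{
    return extend8((int32_t)c1 * (int32_t)c2);
}

/* (int)char_1 * (int)char_2: operands extended first, 16-bit product. */
static int16_t extend_then_product(int8_t c1, int8_t c2)
{
    return (int16_t)((int32_t)c1 * (int32_t)c2);
}

/* Small usage example. */
void char_product_demo(void)
{
    assert(product_then_extend(20, 10) == -56);  /* 200 does not fit in 8 bits */
    assert(extend_then_product(20, 10) == 200);
}
```
-
- The two forms agree whenever the product fits in 8 bits, which covers
- the usual comparisons and range checks.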
-
- On the 68000, I recommend against using single char variables
- (aggregates of char are a different matter) to "save space"
- if they will appear in mixed-mode expressions, since the code to
- extend them will in all cases occupy more space than the one byte
- per variable so saved.
-
- [In order to avoid emulating the speed of a Vic-20 with your target 68000
- system, FLOAT variables are NOT automatically extended to DOUBLE
- everywhere unless the code is being compiled for the 68881 or
- some such chip.]
-
- Pointer arithmetic involves the addition of a pointer to an integer
- (of any size) or the subtraction of two pointers to yield an integer.
- In CSTAR, it takes advantage of the 68000 address modes under most
- circumstances, in order to get good code.
-
- Pointer addition involves two operands of different types, so it is
- truly mixed mode. In CSTAR, scaling is done to whatever width is
- necessary to correctly hold the result of that scaling. In practice,
- most scaling produces a long result; the idiosyncrasies of the 68000
- A registers assist with that. The result of scaling a char (i.e. an
- 8-bit integer) will be integer if the scale factor is small enough
- for that to be correct.
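-
- One way to picture the "small enough" test is a range check on the
- scaled index. The function below is a hypothetical sketch, not CSTAR's
- actual algorithm: it asks whether every value in an index range, once
- multiplied by the scale factor, still fits in a signed 16-bit integer.
-
```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if index * scale fits in a signed 16-bit integer for every
   index in [index_min, index_max], 0 otherwise. */
static int scaled_fits_16(int32_t index_min, int32_t index_max, int32_t scale)
{
    assert(index_min <= index_max && scale > 0);
    int64_t lo = (int64_t)index_min * scale;
    int64_t hi = (int64_t)index_max * scale;
    return lo >= INT16_MIN && hi <= INT16_MAX;
}
```
-
- Under this check, a char index (-128 to 127) scaled by 4 stays within
- 16 bits, while a 16-bit index scaled by 2 needs a long.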
-
- In a complex expression, the C-language operator which is spelled +
- is not associative, and this may lead to surprises. It is not
- associative because it is polymorphic, and the form it takes--pointer
- result or integer result--depends on its operands and therefore
- on its associations.
-
- The surprise is that in the expression:
-
- int_1 + int_2 + pointer;
-
- the first addition is of the integer result form, and it could overflow
- and provide an incorrect operand to the pointer addition, while
-
- pointer + int_1 + int_2;
-
- does two additions of the pointer result form, and the analogous overflow
- cannot happen. (In both cases, of course, a 24-bit overflow out of the
- 68000 address space could occur.) To repeat: when pointers are involved,
- addition is commutative but not associative.
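-
- The overflow can be reproduced on a modern host by modelling the pointer
- as a 32-bit address and the ints as 16-bit values. This is an emulation
- only: the names are hypothetical, the pointer is assumed to point to
- char (so no scaling is involved), and the 16-bit wraparound is written
- out by hand.
-
```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 16 bits of v and read them as a signed 16-bit value. */
static int32_t wrap16(int32_t v)
{
    uint32_t low = (uint32_t)v & 0xFFFFu;
    return (low & 0x8000u) ? (int32_t)low - 65536 : (int32_t)low;
}

/* int_1 + int_2 + pointer: a 16-bit integer add, then a pointer add. */
static uint32_t ints_first(uint32_t ptr, int16_t i1, int16_t i2)
{
    int32_t sum16 = wrap16((int32_t)i1 + (int32_t)i2);
    return ptr + (uint32_t)sum16;
}

/* pointer + int_1 + int_2: two adds of the pointer-result form. */
static uint32_t pointer_first(uint32_t ptr, int16_t i1, int16_t i2)
{
    return ptr + (uint32_t)(int32_t)i1 + (uint32_t)(int32_t)i2;
}

/* Small usage example: 30000 + 10000 does not fit in 16 bits. */
void association_demo(void)
{
    assert(pointer_first(0x10000u, 30000, 10000) == 105536u);
    assert(ints_first(0x10000u, 30000, 10000) == 40000u);
}
```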
-
- In the case of array subscripts, the association of the effective additions
- is ambiguous. That is,
-
- p [i + j]
-
- could be treated as *(p + i + j) or as *(p + (i + j)). In order to
- produce the most conservative results, CSTAR uses the first association.
-
- A command-line switch is available to choose the second form instead.
- If it is necessary to control overflow behavior rigorously in an
- expression (for example, to maintain a circular buffer in a fast
- interrupt-service routine by allowing the pointer to overflow to zero),
- write the expression out with * and + and avoid [], so that it does not
- depend on how subscript calculations are optimized.
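-
- Written out explicitly, the two associations look like this. The
- function names are hypothetical. On a modern hosted compiler both forms
- return the same element for in-range indices; the distinction matters
- only where i + j could overflow at int width, as described above.
-
```c
#include <assert.h>

/* *(p + i + j): two additions of the pointer-result form. */
static int load_pointer_first(const int *p, int i, int j)
{
    return *(p + i + j);
}

/* *(p + (i + j)): one integer addition, then one pointer addition. */
static int load_sum_first(const int *p, int i, int j)
{
    return *(p + (i + j));
}

/* Small usage example. */
void subscript_demo(void)
{
    int table[8] = {10, 11, 12, 13, 14, 15, 16, 17};
    assert(load_pointer_first(table, 2, 3) == 15);
    assert(load_sum_first(table, 2, 3) == 15);
}
```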
-
- [The DRI compiler is unpredictable and often sets up to do the first association--
- and then truncates the result after scaling! If someone can tell us
- that p[i+j] "really means" one form or the other, we can change the
- default flag setting.]
-
- The 68000 indexed addressing modes use a pointer and a signed, scaled
- integer or constant. The resulting surprise is that an unsigned
- word-size integer has to be extended to long before it can be used
- correctly in an address mode or added correctly to an address register.
- If this is a problem in critical code, we suggest you define a
- distinctive macro that does an int cast. The cast generates no code
- itself and suppresses the extension. Use the macro in critical code
- only if you are *absolutely sure* the cast will not overflow. (If you
- get that wrong, you will be stung by a nasty pointer bug.)
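-
- Such a macro might look like the sketch below. The macro and function
- names are invented for illustration, and int16_t stands in for the
- 16-bit int of the 68000. The assert is only a debugging aid for the
- "absolutely sure" condition; it is not something CSTAR provides.
-
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical macro: treat an unsigned 16-bit index as a signed one so
   that no extension to long is needed. Only valid for values <= 32767. */
#define UNCHECKED_INDEX(u) ((int16_t)(u))

/* Returns 1 if u survives the cast unchanged. */
static int index_is_safe(uint16_t u)
{
    return u <= (uint16_t)INT16_MAX;
}

/* Fetch base[u] through the macro, checking the precondition first. */
static char fetch(const char *base, uint16_t u)
{
    assert(index_is_safe(u));
    return base[UNCHECKED_INDEX(u)];
}
```
-
- The distinctive name makes every use easy to find again if a buffer
- ever grows past 32767 bytes.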
-
- The constant in most addressing modes is limited to 32K, so absolute
- (global/external/static) variables cannot be accessed with those modes; nor
- can frame variables beyond 32K. On those machines whose operating
- systems limit memory chunks to 32K or less anyhow, the latter is not a
- problem. Such long-range accesses may take longer than short-frame
- accesses, and they may do actual addition that would not occur with an
- analogous short-frame access.
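-
- The 32K limit corresponds to the signed 16-bit displacement of the
- 68000 address-register-with-displacement mode, which spans -32768 to
- 32767. A minimal sketch of the range test follows; the function names
- are hypothetical and say nothing about how CSTAR itself classifies
- accesses.
-
```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if offset fits a signed 16-bit displacement, 0 otherwise. */
static int fits_disp16(int32_t offset)
{
    return offset >= -32768 && offset <= 32767;
}

/* Small usage example. */
void displacement_demo(void)
{
    assert(fits_disp16(-8));        /* a typical frame variable */
    assert(!fits_disp16(40000));    /* needs a long-range access */
}
```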
-
- Before an integer is used in an address mode, it has to be scaled, unless
- the address is a pointer to char. CSTAR does scaling by 2 by doubling in
- an A register if one is available, or a D register if the variable is
- already in one. Scaling by other powers of 2 is by shifting, and other
- scaling is by multiplication. Scaling a long variable involves a long
- multiplication; be aware that this takes all day (worst case: function
- call and return plus the three mul instructions and miscellany).
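-
- The choice among doubling, shifting, and multiplying can be sketched as
- a small decision function. This is an illustration of the rules stated
- above, not CSTAR's code: the enum and function names are hypothetical,
- register availability is ignored, and the index is unsigned to keep the
- shift well defined in portable C.
-
```c
#include <assert.h>
#include <stdint.h>

enum scale_method { SCALE_NONE, SCALE_DOUBLE, SCALE_SHIFT, SCALE_MULTIPLY };

/* Pick a scaling method from the element size. */
static enum scale_method pick_scale_method(uint32_t factor)
{
    assert(factor > 0);
    if (factor == 1)
        return SCALE_NONE;                 /* pointer to char */
    if (factor == 2)
        return SCALE_DOUBLE;               /* add the index to itself */
    if ((factor & (factor - 1)) == 0)
        return SCALE_SHIFT;                /* other powers of 2 */
    return SCALE_MULTIPLY;                 /* everything else */
}

/* Apply the chosen method; every path yields index * factor. */
static uint32_t scale_index(uint32_t index, uint32_t factor)
{
    unsigned shift = 0;

    switch (pick_scale_method(factor)) {
    case SCALE_NONE:
        return index;
    case SCALE_DOUBLE:
        return index + index;
    case SCALE_SHIFT:
        while (((uint32_t)1 << shift) < factor)
            shift++;
        return index << shift;
    default:
        return index * factor;
    }
}
```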
-
- When CSTAR needs an address, or an address has overgrown the complexity
- of a 68000 addressing mode, code is generated to do some sort of real
- addition with ADDA or LEA. For reasonable expressions, this code is
- very close to what you would write in assembler, and occasionally,
- it is better when a trick shows up that you might not have noticed.
- The cases that are only "very close" involve 0(An, Rn), which
- under some circumstances can be replaced by rearranging the surrounding
- code; the result would run two clock cycles faster--and possibly be one
- word longer. [Maybe there should be a peephole switch.]
-
- [Not implemented yet]
-
- In an associated chain of pointer additions (containing one pointer
- operand and two or more integer operands), CSTAR may, at its option,
- postpone the scaling operation and perform it just once. The result will
- be rigorously guaranteed to be identical to what would have been
- obtained had the postponement not taken place. If necessary, extending
- casts and/or A register addition will be generated.
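-
- The identity behind postponed scaling, and the reason extending casts
- may be needed, can be sketched as follows. The names are hypothetical,
- int is assumed to be 16 bits, and element sizes are limited to 32767 so
- that the 32-bit arithmetic in the sketch cannot overflow.
-
```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 16 bits of v and read them as a signed 16-bit value. */
static int32_t wrap16(int32_t v)
{
    uint32_t low = (uint32_t)v & 0xFFFFu;
    return (low & 0x8000u) ? (int32_t)low - 65536 : (int32_t)low;
}

/* No postponement: each index is scaled separately. */
static int32_t scale_each(int16_t i, int16_t j, int32_t size)
{
    assert(size > 0 && size <= 32767);
    return (int32_t)i * size + (int32_t)j * size;
}

/* Postponed: indices are extended, summed, then scaled once. */
static int32_t scale_once(int16_t i, int16_t j, int32_t size)
{
    assert(size > 0 && size <= 32767);
    return ((int32_t)i + (int32_t)j) * size;
}

/* Postponed without the extending casts: the 16-bit sum can wrap. */
static int32_t scale_once_unextended(int16_t i, int16_t j, int32_t size)
{
    assert(size > 0 && size <= 32767);
    return wrap16((int32_t)i + (int32_t)j) * size;
}
```
-
- With i = 30000, j = 10000 and a size of 4, the first two functions both
- give 160000, while the unextended form gives -102144.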
-